Abstract

We study nonparametric estimation of the distribution function (DF) of a continuous random variable
based on a ranked set sampling design using the exponentially tilted (ET) empirical likelihood
method. We propose ET estimators of the DF and use them to construct new resampling algorithms for 
unbalanced ranked set samples. We explore the properties of the proposed algorithms. For a hypothesis 
testing problem about the underlying population mean, we show that the bootstrap tests based on the 
ET estimators of the DF are asymptotically normal and exhibit a small bias of order $O(n^{-1})$. We
illustrate the methods and evaluate the finite sample performance of the algorithms under both perfect 
and imperfect ranking schemes using a real data set and several Monte Carlo simulation studies. We 
compare the performance of the test statistics based on the ET estimators with those based on the 
empirical likelihood estimators. 
1 Introduction

Ranked set sampling (RSS) is a powerful and cost-effective data collection technique that is often used 
to collect more representative samples from the underlying population when a small number of sampling 
units can be fairly accurately ordered without taking actual measurements on the variable of interest. RSS 
is most effective when obtaining exact measurement on the variable of interest is very costly, but ranking 
the sampling units is relatively inexpensive. RSS finds applications in industrial statistics, environmental 
and ecological studies as well as medical sciences. For recent overviews of the theory and applications of 
RSS and its variations see Wolfe (2012) and Chen et al. (2004). 

Ranked set samples can be either balanced or unbalanced. An unbalanced ranked set sample (URSS) is 
one in which the ranked order statistics are not quantified the same number of times. To obtain an URSS 
of size n from the underlying population we proceed as follows. Let n sets of sampling units, each of size k, 
be randomly chosen from the population using a simple random sampling (SRS) technique. The units of 
each set are ranked by any means other than the actual quantification of the variable of interest. Finally, 
one and only one unit in each ordered set with a pre-specified rank is measured. Let $m_r$ be the number
of measurements on units with rank $r$, $r \in \{1, \ldots, k\}$, such that $n = \sum_{r=1}^{k} m_r$. Suppose $X_{(r)j}$ denotes the
measurement on the $j$th unit with rank $r$. The resulting URSS of size $n$ from the underlying population
is denoted by $X_{URSS} = \{X_1, \ldots, X_n\}$, where the elements of the $r$th row $X_r = (X_{(r)1}, X_{(r)2}, \ldots, X_{(r)m_r})$
are independently and identically distributed (i.i.d.) from $F_{(r)}$, $r = 1, \ldots, k$, and $F_{(r)}$ is the DF of the $r$th
order statistic. Moreover, the $X_{(r)j}$'s are independent for $r = 1, \ldots, k$ and $j = 1, \ldots, m_r$. Note that if $m_r = m$,
$r = 1, \ldots, k$, then URSS reduces to the balanced RSS. The DF of URSS is

$$F_{q_n}(t) = \frac{1}{n}\sum_{r=1}^{k}\sum_{j=1}^{m_r} I(X_{(r)j} \le t) = \sum_{r=1}^{k} q_{m_r}\,\hat F_{(r)}(t),$$    (1)

where $n = \sum_{r=1}^{k} m_r$, $q_{m_r} = m_r/n$ and $\hat F_{(r)}$ is the EDF of the $r$th row. As shown in Chen et al. (2004),
when $n \to \infty$ and $q_{m_r} \to q_r$ for $r = 1, \ldots, k$, we have $F_{q_n}(t) \to F_q(t)$, where

$$F_q(t) = \sum_{r=1}^{k} q_r F_{(r)}(t).$$    (2)

One can easily see that $F_q(t)$ is not equal to the underlying DF $F(t)$ unless $q_r = 1/k$, $r = 1, \ldots, k$, showing
that the EDF based on the URSS data does not provide a good estimate of the underlying distribution $F$.
The properties of the EDF of the balanced and unbalanced RSS are studied in Stokes and Sager (1988) as 
well as Chen et al. (2004). 

In this paper, we use the empirical likelihood method as a nonparametric approach for estimating F . 
To this end, we propose two methods to estimate F using the exponentially tilted (ET) technique. The 
proposed estimators can be used as standard tools for practitioners to estimate the standard error of any 
well-defined statistic based on RSS or URSS data and to make inferences about the characteristics of 
interest of the underlying population. Another interesting problem in this direction is to develop efficient 
resampling techniques for URSS data, as in many cases the exact or the asymptotic distribution of the 
statistics based on URSS data are not available or they are very difficult to obtain (e.g., Chen et al., 2004). 
Akin to the methods of Modarres et al. (2006) and Amiri et al. (2014), the new ET estimators of F are 
used to construct new resampling techniques for URSS data. We study different properties of the proposed 
algorithms. For a hypothesis testing problem about the underlying population mean, we show that the
bootstrap tests based on the ET estimators are asymptotically normal and exhibit a small bias of order
$O(n^{-1})$, which are desirable properties.

The outline of the paper is as follows. In Section 2 we present ET estimators of $F$ based on the URSS
data. Section 3 considers two methods for resampling RSS and URSS data based on the ET estimators
of $F$. We provide justifications for the validity of these methods for a hypothesis testing problem about the
population mean. Section 4 describes a simulation study to compare the finite sample properties of
the proposed methods with parametric bootstrap and some existing resampling techniques for testing a
hypothesis about the population mean. We consider both perfect and imperfect ranking scenarios, three
different distributions and five RSS designs. We compare the performance of our proposed methods with
the one based on the empirical likelihood method studied in Liu et al. (2009) as well as Baklizi (2009).
In Section 5 we apply our methods to a hypothesis testing problem using a real data set consisting of
the birth weight and seven-month weight of 224 lambs along with the mother's weight at time of mating.
Section 6 provides some concluding remarks.


2 Exponential Tilting of DF 

Exponential tilting of an empirical likelihood is a powerful technique in nonparametric statistical inference. 
The impetus of this approach is the use of the estimated DF subject to some constraints rather than the 
EDF. ET methods find applications in computation of bootstrap tail probabilities (Efron and Tibshirani, 
1993), point estimation (Schennach, 2007), estimation of the spatial quantile regression (Rostov, 2012), 
Bayesian treatment of quantile regression (Schennach, 2005), small area estimation (Chaudhuri and Ghosh, 
2011) and calibration estimation (Kim, 2010), among others.

Let $X = \{X_1, \ldots, X_n\}$ be a generic sample of size $n$ from $F$ and suppose $F_n(x) = \frac{1}{n}\sum_{i=1}^{n} I(X_i \le x)$ is
the EDF of $X$, which places empirical frequencies (weights) $1/n$ on each $X_i$. Consider an estimator
$F_p(x) = \sum_{i=1}^{n} p_i I(X_i \le x)$ of $F$ which assigns weights $p_i$ instead of $1/n$ to each $X_i$. To obtain the ET estimator of $F$,
we minimize an aggregated distance between the empirical weights $1/n$ and $p_i$ subject to some constraints
on the $p_i$'s. More specifically, one chooses a distance $d(F_p, F_n) = \sum_{i=1}^{n} d(p_i, 1/n)$ and minimizes $d(F_p, F_n)$
subject to $\sum_{i=1}^{n} p_i = 1$ and some other constraints such as $g(X, \theta_0) = \sum_{i=1}^{n} p_i\, g(X_i, \theta_0) = 0$, using the
following Lagrangian multiplier method

$$d(F_p, F_n) - \lambda\, g(X, \theta_0) - \alpha\Big(\sum_{i=1}^{n} p_i - 1\Big),$$    (3)

where $g(X, \theta_0)$ is often imposed under the null hypothesis in a testing problem or any other conditions that
one needs to account for in practice. Note that the minimization in (3) can also be done by minimizing the
distance between $F_p(x)$ and any target estimator $\tilde F_p(x) = \sum_{i=1}^{n} \tilde p_i I(X_i \le x)$ other than the EDF $F_n(x)$.

The choice of the discrepancy function $d(\cdot, \cdot)$ for the aggregated loss $d(F_p, F_n)$ in (3) leads to different
ET estimators of $F$. Since $F_n(x)$ is the nonparametric maximum likelihood estimator of $F$ under the
Kullback-Leibler distance subject to the restriction $\sum_{i=1}^{n} p_i = 1$, one often uses

$$d(F_p, F_n) = \sum_{i=1}^{n} p_i \log\Big(\frac{p_i}{1/n}\Big).$$

We propose two ET estimators of $F$ based on URSS data with sample size $n = \sum_{r=1}^{k} m_r$, where $k$ is the
set size. The ET estimators are then used to propose new bootstrapping algorithms from URSS data.
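The tilting step for a generic sample can be illustrated with a minimal Python sketch. It assumes the Kullback-Leibler discrepancy and a single mean constraint, so the weights take the form $p_i \propto \exp(\lambda X_i)$ and $\lambda$ is found numerically. The function names (`tilt_weights`, `tilted_mean`, `solve_lambda`), the bisection solver and the toy data are illustrative assumptions, not part of the paper.

```python
import math

def tilt_weights(x, lam):
    """Return weights p_i proportional to exp(lam * x_i)."""
    shift = max(lam * xi for xi in x)          # subtract the largest exponent for stability
    w = [math.exp(lam * xi - shift) for xi in x]
    total = sum(w)
    return [wi / total for wi in w]

def tilted_mean(x, lam):
    """Weighted mean of x under the tilted weights."""
    return sum(p * xi for p, xi in zip(tilt_weights(x, lam), x))

def solve_lambda(x, target, tol=1e-12, max_iter=500):
    """Bisection for lam such that tilted_mean(x, lam) == target.

    The tilted mean is increasing in lam, so a bracket with a sign change
    contains the unique root. The target must lie strictly inside the data range.
    """
    if not (min(x) < target < max(x)):
        raise ValueError("target must lie strictly inside the range of the data")
    lo, hi = -1.0, 1.0
    while tilted_mean(x, lo) > target:         # widen the bracket to the left
        lo *= 2.0
    while tilted_mean(x, hi) < target:         # widen the bracket to the right
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if tilted_mean(x, mid) < target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

x = [0.3, 1.1, 1.9, 2.4, 3.8]                  # toy data, not from the paper
lam = solve_lambda(x, target=1.5)              # tilt the mean from 1.9 down to 1.5
p = tilt_weights(x, lam)
print(round(sum(p), 6), round(tilted_mean(x, lam), 6))
```

When the target equals the sample mean, the solver returns a value of $\lambda$ close to zero and the weights reduce to the empirical weights $1/n$.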

2.1 Exponential Tilting of All Observations (EAT) 

In this section, we propose our first ET estimator of $F$ which is later used to resample from within each
row of $X_{URSS} = \{X_{(r)j},\ r = 1, \ldots, k;\ j = 1, \ldots, m_r\}$. The idea behind the first ET estimator of $F$, for
bootstrapping $X_{URSS}$, is to find an estimator

$$\hat F_p(x) = \sum_{r=1}^{k}\sum_{j=1}^{m_r} p_{(r)j}\, I(X_{(r)j} \le x),$$    (4)

subject to the constraints

$$\sum_{r=1}^{k}\sum_{j=1}^{m_r} p_{(r)j} = 1 \quad\text{and}\quad \sum_{r=1}^{k}\sum_{j=1}^{m_r} p_{(r)j} X_{(r)j} = \bar X_{URSS},$$    (5)

where $\bar X_{URSS} = \frac{1}{n}\sum_{r=1}^{k}\sum_{j=1}^{m_r} X_{(r)j}$.

Lemma 1. Let $X_{URSS} = \{X_{(r)j},\ r = 1, \ldots, k;\ j = 1, \ldots, m_r\}$ be a URSS sample of size $n$ from the
underlying population $F$ when the set size is $k$ and $X_{(r)j} \in \mathbb{R}$ is the $r$-th order statistic in a simple random
sample of size $k$ from $F$. The optimum values of $p_{(r)j}$ in (4) under the constraints (5) are given by

$$p_{(r)j} = \frac{\exp(\lambda X_{(r)j})}{\sum_{r=1}^{k}\sum_{j=1}^{m_r} \exp(\lambda X_{(r)j})}, \quad r = 1, \ldots, k;\ j = 1, \ldots, m_r,$$    (6)

where $\lambda$ is obtained from $\sum_{r=1}^{k}\sum_{j=1}^{m_r} p_{(r)j} X_{(r)j} = \bar X_{URSS}$.

Proof. Using the Lagrange multipliers method, and by minimizing

$$\sum_{r=1}^{k}\sum_{j=1}^{m_r} p_{(r)j}\log\Big(\frac{p_{(r)j}}{1/n}\Big) + \lambda\Big(\sum_{r=1}^{k}\sum_{j=1}^{m_r} p_{(r)j}X_{(r)j} - \bar X_{URSS}\Big) + \alpha\Big(\sum_{r=1}^{k}\sum_{j=1}^{m_r} p_{(r)j} - 1\Big),$$    (7)

with respect to the $p_{(r)j}$'s, one can easily obtain the optimum values in (6). □
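The step behind the proof of Lemma 1 can be spelled out as follows. This is a sketch that assumes the discrepancy in (7) is the Kullback-Leibler form $\sum p_{(r)j}\log(n\,p_{(r)j})$ introduced at the start of Section 2; the symbol $L$ for the objective is introduced here only for illustration.

```latex
% L(p) = \sum_{r,j} p_{(r)j}\log(n\,p_{(r)j})
%        + \lambda\Big(\sum_{r,j} p_{(r)j}X_{(r)j} - \bar X_{URSS}\Big)
%        + \alpha\Big(\sum_{r,j} p_{(r)j} - 1\Big)
\begin{align*}
\frac{\partial L}{\partial p_{(r)j}}
  &= \log\big(n\,p_{(r)j}\big) + 1 + \lambda X_{(r)j} + \alpha = 0
  \;\Longrightarrow\;
  p_{(r)j} = \tfrac{1}{n}\,e^{-1-\alpha}\,\exp\big(-\lambda X_{(r)j}\big), \\
\sum_{r,j} p_{(r)j} = 1
  &\;\Longrightarrow\;
  p_{(r)j} = \frac{\exp\big(-\lambda X_{(r)j}\big)}
                  {\sum_{r'=1}^{k}\sum_{j'=1}^{m_{r'}} \exp\big(-\lambda X_{(r')j'}\big)} .
\end{align*}
```

Because $\lambda$ is a free multiplier, relabelling $-\lambda$ as $\lambda$ gives the form stated in (6); the remaining mean constraint then determines its value.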

In Section 3 we use $\hat F_p(x) = \sum_{r=1}^{k}\sum_{j=1}^{m_r} p_{(r)j}\, I(X_{(r)j} \le x)$ for bootstrapping $X_{URSS}$ instead of the
commonly used empirical DF. It is worth noting that for hypothesis testing problems about the underlying
population mean $\mu$ involving the null hypothesis $H_0: \mu = \mu_0$, minimization in (7) is done subject to the
condition $\sum_{r=1}^{k}\sum_{j=1}^{m_r} p_{(r)j} X_{(r)j} = \mu_0$. Using the optimum weights $p_{(r)j}$ from the ET estimate of $F$, we
also propose $S^2 = \sum_{r=1}^{k}\sum_{j=1}^{m_r} p_{(r)j}(X_{(r)j} - \bar X_{URSS})^2$ to estimate the population variance $\sigma^2$.

2.2 Exponential Tilting of Rows (EAR) 

By the structure of the URSS data, $X_{URSS}$, we observe that $X_{(r)1}, \ldots, X_{(r)m_r}$ are i.i.d. samples from
$F_{(r)}(\cdot)$, which is the distribution of the $r$-th order statistic in a simple random sample of size $k$ from $F$.
Since

$$F(t) = \frac{1}{k}\sum_{r=1}^{k} F_{(r)}(t),$$

the idea behind our next proposed ET estimator of $F$ is to estimate each $F_{(r)}$ using $X_{(r)1}, \ldots, X_{(r)m_r}$ and
construct an estimator of $F$ by averaging over these estimators using suitable weights obtained from the
Lagrange multipliers method under some constraints. To this end, we work with an estimator of $F$ of the
form

$$\hat F_p(t) = \sum_{r=1}^{k} p_{(r)}\,\hat F_{(r)}(t),$$    (8)

where $\hat F_{(r)}(t) = \frac{1}{m_r}\sum_{j=1}^{m_r} I(X_{(r)j} \le t)$ is the EDF of $X_{(r)1}, \ldots, X_{(r)m_r}$.

Lemma 2. Let $X_{URSS} = \{X_{(r)j},\ r = 1, \ldots, k;\ j = 1, \ldots, m_r\}$ be a URSS sample of size $n$ from $F$ where
the set size is $k$ and $\{X_{(r)j},\ j = 1, \ldots, m_r\}$ are i.i.d. samples from $F_{(r)}$, the DF of the $r$-th order statistic
of a simple random sample of size $k$ from $F$. Then, an optimum estimator of $F$ in the form of (8) under
the constraints $\sum_{r=1}^{k} p_{(r)} = 1$ and $\sum_{r=1}^{k} p_{(r)}\bar X_{(r)} = \bar X_{URSS}$, where $\bar X_{(r)} = \frac{1}{m_r}\sum_{j=1}^{m_r} X_{(r)j}$, $r = 1, \ldots, k$, is
given by

$$\hat F_p(x) = \sum_{r=1}^{k} \frac{p_{(r)}}{m_r}\sum_{j=1}^{m_r} I(X_{(r)j} \le x) \quad\text{with}\quad p_{(r)} = \frac{\exp(\lambda \bar X_{(r)})}{\sum_{r=1}^{k}\exp(\lambda \bar X_{(r)})},$$    (9)

where $\lambda$ is obtained from $\sum_{r=1}^{k} p_{(r)}\bar X_{(r)} = \bar X_{URSS}$.

Proof. The results easily follow using the Lagrange multipliers method and minimizing

$$\sum_{r=1}^{k} p_{(r)}\ln\Big(\frac{p_{(r)}}{1/k}\Big) + \lambda\Big(\sum_{r=1}^{k} p_{(r)}\bar X_{(r)} - \bar X_{URSS}\Big) + \alpha\Big(\sum_{r=1}^{k} p_{(r)} - 1\Big),$$    (10)

with respect to $p_{(r)}$. □


In Section 3 we use $\hat F_p(x) = \sum_{r=1}^{k} \frac{p_{(r)}}{m_r}\sum_{j=1}^{m_r} I(X_{(r)j} \le x)$ and propose a new bootstrapping algorithm
to resample from $X_{URSS}$ instead of the commonly used empirical DF. Here again, for hypothesis testing
problems involving $H_0: \mu = \mu_0$, where $\mu$ is the population mean, minimization in (10) is done subject to
the condition $\sum_{r=1}^{k} p_{(r)}\bar X_{(r)} = \mu_0$.
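The row-level tilting of Lemma 2 can be sketched in Python as below. The sketch assumes the constraint $\sum_r p_{(r)}\bar X_{(r)} = \mu_0$ and solves for $\lambda$ by bisection; the names `ear_row_weights` and `ear_cdf`, the solver and the toy rows are illustrative assumptions rather than the authors' implementation. Note that a solution exists only when the target lies strictly between the smallest and largest row means.

```python
import math

def ear_row_weights(rows, target, tol=1e-12, max_iter=500):
    """Row weights p_(r) proportional to exp(lam * rowmean_r), with lam chosen
    so that sum_r p_(r) * rowmean_r equals target."""
    xbar = [sum(r) / len(r) for r in rows]
    if not (min(xbar) < target < max(xbar)):
        raise ValueError("target must lie strictly between the extreme row means")

    def weights(lam):
        shift = max(lam * v for v in xbar)      # stabilise the exponentials
        w = [math.exp(lam * v - shift) for v in xbar]
        s = sum(w)
        return [wi / s for wi in w]

    def mean(lam):
        return sum(p * v for p, v in zip(weights(lam), xbar))

    lo, hi = -1.0, 1.0
    while mean(lo) > target:
        lo *= 2.0
    while mean(hi) < target:
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if mean(mid) < target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return weights(0.5 * (lo + hi))

def ear_cdf(rows, p_rows, x):
    """Evaluate sum_r p_(r)/m_r * #{j : X_(r)j <= x}."""
    return sum(p * sum(1 for v in row if v <= x) / len(row)
               for p, row in zip(p_rows, rows))

# Toy URSS with k = 3 rows of unequal length (values are made up).
rows = [[0.2, 0.5], [0.9, 1.1, 1.4], [1.8, 2.6]]
p_rows = ear_row_weights(rows, target=1.0)
print([round(p, 4) for p in p_rows], round(ear_cdf(rows, p_rows, 1.0), 4))
```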


Remark 1. If for the observed URSS data all the $m_r$'s are large enough, then one can use ET estimators
of $F_{(r)}$ by simply treating the $X_{(r)j}$'s as a SRS of size $m_r$ from $F_{(r)}$ and constructing the estimator
$\hat F(t) = \frac{1}{k}\sum_{r=1}^{k} \hat F_{(r)}(t)$ for $F$. Here, $\hat F_{(r)}(t) = \sum_{j=1}^{m_r} w_{j(r)}\, I(X_{(r)j} \le t)$ and the $w_{j(r)}$'s are obtained subject to the
constraints $\sum_{j=1}^{m_r} w_{j(r)} = 1$ and $\sum_{j=1}^{m_r} w_{j(r)} X_{(r)j} = \bar X_{(r)}$, for $r = 1, \ldots, k$, using the following Lagrange multipliers
problems:

$$\sum_{j=1}^{m_r} w_{j(r)}\ln\Big(\frac{w_{j(r)}}{1/m_r}\Big) + \lambda_r\Big(\sum_{j=1}^{m_r} w_{j(r)}X_{(r)j} - \bar X_{(r)}\Big) + \alpha_r\Big(\sum_{j=1}^{m_r} w_{j(r)} - 1\Big), \quad r = 1, \ldots, k.$$


3 Bootstrapping URSS and RSS 

In this section, we propose two new bootstrapping techniques to resample from a balanced or unbalanced
ranked set sample of size $n$. The first algorithm is based on the ET estimator of $F$ in Lemma 1 to resample
the entire URSS while the second one uses the ET estimator of $F$ in Lemma 2 to resample from within
each row separately. We note that most of the bootstrap methods developed for RSS are based on the
EDF and one can easily modify them using ET estimators of $F$. Monte Carlo simulation studies indicate
that bootstrapping methods based on the ET estimators of $F$ perform better than their counterparts using
the EDF.

3.1 Bootstrapping Algorithm: EAT 

To resample from the ET estimator of $F$ given by

$$\hat F_p(x) = \sum_{r=1}^{k}\sum_{j=1}^{m_r} p_{(r)j}\, I(X_{(r)j} \le x),$$

where $p_{(r)j}$ is defined in (6), we proceed as follows:

1. Assign probability $p_{(r)j}$ to each element $X_{(r)j}$ of $X_{URSS}$.

2. Randomly draw $k$ observations from $X_{URSS}$ according to probabilities $\{p_{(r)j}\}$; order them as
$X_{(1)} \le \ldots \le X_{(k)}$ and retain $X^*_{(r)1} = X_{(r)}$.

3. Repeat Step 2 for $r = 1, \ldots, k$ and $j = 1, \ldots, m_r$ to generate a bootstrap URSS $\{X^*_{(r)j}\}$.

4. Repeat Steps 2-3 $B$ times to obtain the bootstrap samples.
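The steps above can be sketched in Python as follows. The sketch assumes the URSS is stored as a list of rows (row $r$ holds the units measured at rank $r$) with a matching list of weights; the name `eat_bootstrap_sample`, the toy data and the uniform placeholder weights are assumptions for illustration. In practice the weights would come from (6).

```python
import random

def eat_bootstrap_sample(rows, probs, rng):
    """Generate one bootstrap URSS with the same design as `rows`.

    rows[r] holds the observations measured at rank r + 1 and probs[r] the
    matching tilted weights. For each unit to be generated, k values are drawn
    from the pooled sample with these weights, sorted, and the one with the
    required rank is retained.
    """
    k = len(rows)
    pool = [x for row in rows for x in row]
    weights = [p for prow in probs for p in prow]
    boot = []
    for r in range(k):
        new_row = []
        for _ in range(len(rows[r])):
            ranked = sorted(rng.choices(pool, weights=weights, k=k))
            new_row.append(ranked[r])           # keep the r-th order statistic
        boot.append(new_row)
    return boot

# Toy URSS with k = 3; uniform weights are used here only as a placeholder.
rows = [[0.2, 0.5], [0.9, 1.1, 1.4], [1.8, 2.6]]
n = sum(len(row) for row in rows)
probs = [[1.0 / n] * len(row) for row in rows]

rng = random.Random(2024)
boot_samples = [eat_bootstrap_sample(rows, probs, rng) for _ in range(5)]  # B = 5
print(boot_samples[0])
```

Drawing a fresh set of $k$ values for every retained unit mirrors the way the original URSS is collected, one ranked set per measurement.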

One can easily validate the use of the ET estimator of $F$ for different bootstrapping purposes. For
example, suppose we want to carry out a bootstrap test for testing $H_0: \mu = \mu_0$ against $H_a: \mu > \mu_0$, where
$\mu$ is the unknown parameter of interest. Using Hall (1992), the Edgeworth expansion of the $p$-value for
testing $H_0$ against $H_a$ based on a SRS of size $mk$ from the underlying population with the test statistic
$T = \sqrt{mk}\,(\bar X - \mu_0)/S$ is given by

$$P = P(T > t) = 1 - \Phi(t) - (mk)^{-1/2} q(t)\phi(t) + O\Big(\frac{1}{mk}\Big),$$    (11)

where $q(\cdot)$ is a quadratic function and $\Phi(\cdot)$ and $\phi(\cdot)$ are the standard normal distribution and density
functions, respectively. We consider the problem for a balanced RSS case, as the following argument
can also be applied to URSS data with some modifications. Let $\{X_{(r)j},\ r = 1, \ldots, k;\ j = 1, \ldots, m\}$ be
a balanced ranked set sample of size $mk$ from the underlying population with mean $\mu$. We show that
the ET bootstrap approximation of the sampling distribution of $T$ is in error by only $1/mk$ and the
$p$-value obtained through the EAT method has the desirable second order accuracy. This is similar to results
obtained in DiCiccio and Romano (1990). For more details see Efron (1981) and Feuerverger et al. (1999).

Proposition 1. Suppose $\{X^*_{(r)j},\ r = 1, \ldots, k;\ j = 1, \ldots, m\}$ is a bootstrap sample generated from the EAT
algorithm. Let $T^*$ be the bootstrap test statistic for testing $H_0: \mu = \mu_0$ with $p$-value $P^*$, computed from
$\bar X^*$ and $S^{2*}$, where $\bar X^*$ is the mean of the bootstrap sample obtained from the ET estimator of $F$ and
$S^{2*} = \frac{1}{k}\sum_{r=1}^{k} S^{2*}_{(r)}$ with $S^{2*}_{(r)} = \frac{1}{m-1}\sum_{j=1}^{m}\big(X^*_{(r)j} - \bar X^*_{(r)}\big)^2$. Then,

$$P - P^* = O\Big(\frac{1}{mk}\Big),$$    (12)

where $P$, given by (11), is the $p$-value of the usual $T$-test based on a simple random sample of comparable
size $mk$ from the underlying population.


Proof. For simplicity, we write the resampled data as $\{X^*_1, \ldots, X^*_{km}\}$. In order to test $H_0: \mu = \mu_0$, and
to ensure that the null hypothesis is incorporated into the ET estimator of $F$, we introduce the Lagrange
multipliers for the constraints $\sum_{i=1}^{km} p_i = 1$ and $\sum_{i=1}^{km} p_i X^*_i = \mu_0$, where the weights $p_i$ are obtained as

$$p_i(\mu_0) = \frac{\exp(\lambda(\mu_0) X^*_i)}{\sum_{i=1}^{km} \exp(\lambda(\mu_0) X^*_i)},$$    (13)

and $\lambda(\mu_0)$ is the coefficient calculated from $\sum_{i=1}^{km} p_i(\mu_0) X^*_i = \mu_0$. One can easily show that the $X^*_i$'s are
generated from

$$dF_p(x) = e^{\{\lambda(\mu)x - A(\lambda(\mu))\}}\, dF_n(x),$$    (14)

where $A(\lambda(\mu)) = \log\big(\frac{1}{n}\sum_{i=1}^{n} \exp(\lambda(\mu)X_i)\big)$. To obtain the ET estimator of $F$ under the null hypothesis
we must have

$$\mu_0 = A'(\lambda(\mu_0)) = \frac{\sum_{i=1}^{n} X_i \exp(\lambda(\mu_0)X_i)}{\sum_{i=1}^{n} \exp(\lambda(\mu_0)X_i)}.$$

Therefore, one can use the bootstrap test statistic $T^*$ for testing $H_0: \mu = \mu_0$, where $\bar X^*$ is
the mean of the bootstrap sample obtained from the ET estimator of $F$ and $S^{2*} = \frac{1}{k}\sum_{r=1}^{k} S^{2*}_{(r)}$ with
$S^{2*}_{(r)} = \frac{1}{m-1}\sum_{j=1}^{m}\big(X^*_{(r)j} - \bar X^*_{(r)}\big)^2$. Following Hall (1992) and using the Edgeworth expansion, the $p$-value
for testing $H_0: \mu = \mu_0$ against $H_a: \mu > \mu_0$ using the bootstrap test statistic $T^*$ is given by

$$P^* = P(T^* > t \mid F_p) = 1 - \Phi(t) - (mk)^{-1/2} q(t)\phi(t) + O\Big(\frac{1}{mk}\Big),$$

where $q$ is a quadratic function. Now, the result follows from (11). □


3.2 Bootstrapping Algorithm EAR 

The idea behind this method is to use the ET estimator of $F$ given by

$$\hat F_p(x) = \sum_{r=1}^{k} \frac{p_{(r)}}{m_r}\sum_{j=1}^{m_r} I(X_{(r)j} \le x),$$

where $p_{(r)}$ is defined in (9). To this end we proceed as follows:

1. Assign probabilities $p_{(r)}$ to each row $X_r$ of $X_{URSS}$, $r = 1, \ldots, k$.

2. Select a row randomly using $p_{(r)}$ and select an observation randomly from that row.

3. Continue Step 2 for $k$ times to obtain $k$ observations.

4. Order them as $X_{(1)} \le \ldots \le X_{(k)}$ and retain $X^*_{(r)1} = X_{(r)}$.

5. Perform Steps 2-4 $m_r$ times and obtain $\{X^*_{(r)1}, \ldots, X^*_{(r)m_r}\}$.

6. Perform Steps 2-5 for $r = 1, \ldots, k$.

7. Repeat Steps 2-6 $B$ times to obtain the bootstrap samples.
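A minimal Python sketch of these steps is given below. It assumes the row weights have already been computed as in (9); the function name `ear_bootstrap_sample`, the toy rows and the illustrative weights are assumptions and do not come from the paper.

```python
import random

def ear_bootstrap_sample(rows, p_rows, rng):
    """Generate one bootstrap URSS by two-stage draws.

    Each of the k values in a resampled set is obtained by first picking a row
    with probabilities p_rows and then picking one observation uniformly from
    that row. The set is sorted and the value with the required rank is kept.
    """
    k = len(rows)
    boot = []
    for r in range(k):
        new_row = []
        for _unit in range(len(rows[r])):
            draws = []
            for _pick in range(k):
                src = rng.choices(range(k), weights=p_rows, k=1)[0]  # choose a row
                draws.append(rng.choice(rows[src]))                  # choose a unit in it
            draws.sort()
            new_row.append(draws[r])
        boot.append(new_row)
    return boot

rows = [[0.2, 0.5], [0.9, 1.1, 1.4], [1.8, 2.6]]   # toy URSS, k = 3
p_rows = [0.3, 0.4, 0.3]                           # illustrative row weights summing to 1

rng = random.Random(7)
boot_samples = [ear_bootstrap_sample(rows, p_rows, rng) for _ in range(5)]  # B = 5
print(boot_samples[0])
```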

4 Monte Carlo Study 

In this section, we compare the finite sample performance of our nonparametric EAT and EAR resampling
methods with a parametric bootstrap (PB) procedure. The PB method uses a parametric test (PT) with
an asymptotic normal distribution to test the hypothesis $H_0: \mu = \mu_0$, where $\mu$ is the unknown parameter
of interest and $\mu_0$ is a known constant. The resampling is performed using $B = 500$ resamples and the entire
experiment is then replicated 2000 times. We use several RSS and URSS designs with different sample
sizes when the set size is chosen to be $k = 5$. We also conducted unreported simulation studies for other
values of $k$ and observed performance similar to that summarized below.

The RSS designs that we consider are written as $D = (m_1, m_2, \ldots, m_5)$ with $n_D = \sum_{r=1}^{5} m_r$. For
example, the first design is balanced with $k = 5$ and $m_r = 5$ observations per stratum, which is denoted
by

$D_1 = (5, 5, 5, 5, 5)$ with $n_{D_1} = 25$.

Similarly, we define the following designs,

$D_2 = (8, 3, 3, 2, 4)$ with $n_{D_2} = 25$,

$D_3 = (3, 2, 5, 8, 3)$ with $n_{D_3} = 21$,

$D_4 = (3, 10, 3, 3, 3)$ with $n_{D_4} = 22$,

$D_5 = (4, 2, 3, 3, 8)$ with $n_{D_5} = 24$.

We obtain samples from the Normal(0,1), Logistic(1,1) and Exponential(1) distributions.
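For readers who wish to reproduce such designs, the following Python sketch draws a URSS under perfect ranking: for every unit to be measured, a set of $k$ values is generated, sorted, and the value with the pre-specified rank is kept. The function name `draw_urss`, the sampler lambdas and the seed are assumptions for illustration; the logistic sampler uses the inverse-CDF formula with location 1 and scale 1.

```python
import math
import random

def draw_urss(design, sampler, rng):
    """Draw a URSS with perfect ranking.

    design[r] is the number of units measured at rank r + 1, the set size is
    k = len(design), and sampler(rng) returns one draw from the population.
    """
    k = len(design)
    rows = []
    for r, m_r in enumerate(design):
        row = []
        for _ in range(m_r):
            ranked_set = sorted(sampler(rng) for _ in range(k))
            row.append(ranked_set[r])          # measure only the unit with rank r + 1
        rows.append(row)
    return rows

def normal01(g):
    return g.gauss(0.0, 1.0)

def exponential1(g):
    return g.expovariate(1.0)

def logistic11(g):
    u = g.random()
    while u == 0.0:                             # avoid log(0)
        u = g.random()
    return 1.0 + math.log(u / (1.0 - u))

rng = random.Random(1)
design_d3 = (3, 2, 5, 8, 3)                     # design D3 from the text
sample_normal = draw_urss(design_d3, normal01, rng)
sample_expo = draw_urss(design_d3, exponential1, rng)
sample_logis = draw_urss(design_d3, logistic11, rng)
print([len(row) for row in sample_normal])
```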

4.1 Testing a hypothesis about the population mean 

We first proceed with the following proposition. 

Proposition 2. Suppose $F$ is the DF of the variable of interest in the underlying population with $\int x^2\, dF(x) < \infty$.
Let $\hat F_{(r)}$ be the EDF of the $r$th row of a balanced RSS data and $\mu$ represent the population mean.
Then $(\vartheta_1, \ldots, \vartheta_k)$ with $\vartheta_i = \mu(\hat F_{(i)}) - \mu_{(i)}$ converges in distribution to a multivariate normal
distribution with the mean vector zero and the covariance matrix $\Sigma = \mathrm{diag}\big(\sigma^2(F_{(1)})/m, \ldots, \sigma^2(F_{(k)})/m\big)$, where
$\sigma^2(F_{(i)}) = \int (x - \mu_{(i)})^2\, dF_{(i)}(x)$ and $\mu_{(i)} = \int x\, dF_{(i)}(x)$.
This proposition suggests using the following test statistic for testing the hypothesis $H_0: \mu = \mu_0$:

$$T(X, \mu_0) = \frac{\frac{1}{k}\sum_{r=1}^{k} \bar X_{(r)} - \mu_0}{S} \xrightarrow{d} N(0, 1),$$    (15)

where

$$S^2 = \frac{1}{k^2}\sum_{r=1}^{k} \frac{\sigma^2(\hat F_{(r)})}{m_r}.$$    (16)

The test statistic $T(X, \mu_0)$, which is approximately Normal(0,1) for large $k$, is referred to as the PT in
the rest of the work. Ahn et al. (2014) consider the Welch-type (WT) approximation to the distribution of
$T(X, \mu_0)$, where the degree of freedom of the test can be approximated using


S 2 = 



(17) 
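The PT statistic in (15) and (16) can be computed with a few lines of Python. This sketch assumes that the row variances are estimated by the usual unbiased sample variance (divisor $m_r - 1$), which requires at least two observations per row; a plug-in version based on the row EDF would divide by $m_r$ instead. The function name `pt_statistic` and the toy data are illustrative.

```python
import math

def pt_statistic(rows, mu0):
    """Compute T(X, mu0) = (mean of the row means - mu0) / S, where
    S^2 = (1 / k^2) * sum_r var_r / m_r and var_r is the sample variance of row r."""
    k = len(rows)
    row_means = [sum(row) / len(row) for row in rows]
    s2 = 0.0
    for row, xbar in zip(rows, row_means):
        m_r = len(row)
        var_r = sum((x - xbar) ** 2 for x in row) / (m_r - 1)   # assumes m_r >= 2
        s2 += var_r / m_r
    s2 /= k ** 2
    return (sum(row_means) / k - mu0) / math.sqrt(s2)

toy_rows = [[1.0, 3.0], [2.0, 4.0]]      # k = 2 rows, made-up values
t_value = pt_statistic(toy_rows, mu0=2.0)
print(round(t_value, 4))
```

For the toy rows the row means are 2 and 3, each row variance is 2, so $S^2 = (1 + 1)/4 = 0.5$ and the statistic equals $0.5/\sqrt{0.5}$.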


The nonparametric bootstrap tests using the EAT and EAR methods are conducted based on the following
steps:

1. Let $X$ be an URSS/RSS sample from $F$.

2. Calculate $T = T(X, \mu_0)$, given in (15), under the null hypothesis $H_0: \mu = \mu_0$.

3. Apply each of the resampling procedures on $X$ to obtain $X^*_b = \{X^*_{(r)j}\}_b$.

4. Calculate $T^*_b = T(X^*_b, \mu_0)$, $b = 1, \ldots, B$.

5. Obtain the proportion of rejections via $\frac{1}{B}\sum_{b=1}^{B} I(T^*_b > T)$ to estimate the $p$-value.
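The whole procedure can be put together in a compact Python sketch for the EAT method. It is an illustration under several assumptions: the weights are tilted so that the weighted mean of the pooled sample equals $\mu_0$, the statistic follows (15) and (16) with sample variances using divisor $m_r - 1$, and the $p$-value is the proportion of bootstrap statistics exceeding the observed one. All function names and the toy data are hypothetical.

```python
import math
import random

def tilt_probs(values, target, iters=200):
    """Weights proportional to exp(lam * x) whose weighted mean equals target."""
    if not (min(values) < target < max(values)):
        raise ValueError("mu0 must lie strictly inside the range of the data")

    def probs(lam):
        shift = max(lam * v for v in values)
        w = [math.exp(lam * v - shift) for v in values]
        s = sum(w)
        return [wi / s for wi in w]

    def mean(lam):
        return sum(p * v for p, v in zip(probs(lam), values))

    lo, hi = -1.0, 1.0
    while mean(lo) > target:
        lo *= 2.0
    while mean(hi) < target:
        hi *= 2.0
    for _ in range(iters):                      # plain bisection on lam
        mid = 0.5 * (lo + hi)
        if mean(mid) < target:
            lo = mid
        else:
            hi = mid
    return probs(0.5 * (lo + hi))

def pt_statistic(rows, mu0):
    k = len(rows)
    means = [sum(r) / len(r) for r in rows]
    s2 = sum(sum((x - mb) ** 2 for x in r) / (len(r) - 1) / len(r)
             for r, mb in zip(rows, means)) / k ** 2
    num = sum(means) / k - mu0
    if s2 == 0.0:                               # degenerate resample: every row constant
        return math.copysign(math.inf, num)
    return num / math.sqrt(s2)

def eat_resample(rows, pool, weights, rng):
    k = len(rows)
    return [[sorted(rng.choices(pool, weights=weights, k=k))[r]
             for _ in range(len(rows[r]))] for r in range(k)]

def eat_bootstrap_pvalue(rows, mu0, B, rng):
    pool = [x for row in rows for x in row]
    weights = tilt_probs(pool, mu0)             # impose H0 on the resampling distribution
    t_obs = pt_statistic(rows, mu0)
    exceed = sum(pt_statistic(eat_resample(rows, pool, weights, rng), mu0) > t_obs
                 for _ in range(B))
    return exceed / B

rows = [[-1.2, -0.8, -1.5], [-0.3, 0.1, -0.4], [0.9, 1.3, 0.6]]   # toy balanced RSS
p_value = eat_bootstrap_pvalue(rows, mu0=0.0, B=200, rng=random.Random(0))
print(p_value)
```

An EAR version would replace `tilt_probs` and `eat_resample` by their row-level counterparts while keeping the same outer loop.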


We also performed the desired hypothesis test using PB by generating URSS samples from Normal(0,1),
Logistic(1,1) and Exponential(1) distributions. To perform the PB test we use the following steps (for more
details on the PB method see Efron and Tibshirani (1993)):

1. Let $X$ be a URSS sample from a distribution $F_\theta$ where $\theta$ is the unknown parameter and let $\mu = E_\theta(X)$.

2. Calculate $T = T(X, \mu_0)$ under the null hypothesis $H_0: \mu = \mu_0$.

3. Estimate $\theta$ from $X$ and take a URSS from $F_{\hat\theta}$, $X^*_b = \{X^*_{(r)j}\}_b$.

4. Calculate $T^* = T(X^*, \mu_0)$.

5. Obtain the proportion of rejections via $\frac{1}{B}\sum_{b=1}^{B} I(T^*_b > T)$ to estimate the $p$-value.

To conduct the parametric bootstrap we estimated the population mean using the sample mean and
used $\sigma = 1$. Subsequently, we generated samples from the $N(\bar x, 1)$, Logistic$(\bar x, 1)$ and Exponential$(\bar x)$
distributions. Table 1 displays the observed $\alpha$ levels. The parametric bootstrap (PB) method is accurate
and its estimated $\alpha$ levels are close to the nominal level 0.05. The PT test is liberal: its observed
$\alpha$ level is higher than the nominal level, especially under the exponential distribution. We observe that the
WT test is a bit conservative under the normal and logistic distributions, i.e., its observed $\alpha$ levels
are lower than the nominal level. The observed $\alpha$ levels for EAR follow the PB method closely and they
are less liberal than the PT under the exponential distribution.

Table 1: Observed $\alpha$-levels of the proposed tests for testing $H_0: \mu = 0$ under the Normal distribution and
$H_0: \mu = 1$ for the Exponential and Logistic distributions.

Distribution      D     PT     WT     EAT    EAR    PB
N(0, 1)           D1    0.062  0.041  0.056  0.052  0.050
                  D2    0.078  0.039  0.054  0.056  0.054
                  D3    0.072  0.038  0.046  0.047  0.049
                  D4    0.071  0.033  0.057  0.058  0.054
                  D5    0.064  0.039  0.043  0.045  0.047
Exponential(1)    D1    0.107  0.071  0.081  0.080  0.051
                  D2    0.133  0.072  0.076  0.079  0.049
                  D3    0.132  0.081  0.089  0.090  0.054
                  D4    0.131  0.073  0.098  0.094  0.050
                  D5    0.098  0.074  0.058  0.055  0.053
Logistic(1, 1)    D1    0.052  0.042  0.050  0.051  0.047
                  D2    0.076  0.041  0.058  0.059  0.050
                  D3    0.065  0.033  0.048  0.050  0.046
                  D4    0.068  0.034  0.059  0.057  0.051
                  D5    0.059  0.034  0.043  0.044  0.041

Table 2 displays the estimated power values under shift alternatives $H_a: \mu = \mu_0 + \delta$ with $\delta \neq 0$. We
used 95% percentile bootstrap confidence intervals for $\mu$, using EAT and EAR, to obtain the power of the
test statistics at $\alpha = 0.05$. The entries of the table are the proportion of times that the bootstrap
confidence intervals do not cover zero. Compared with PT, both the EAT and EAR methods lead to
high power, so they are suitable candidates for conducting the test. The results of other simulation
studies (not presented here) show similar behavior for other values of $k$ such as $k = 2, 3, 8, 10$. We also
considered different sample sizes. The better performance of the proposed methods is apparent for small
and relatively small sample sizes (which often happens in practice for RSS), and the methods perform similarly
when the sample size gets very large for a fixed set size.

4.2 Imperfect ranking 

In this section, we compare the finite sample performance of our proposed bootstrapping techniques with
the PB under imperfect ranking cases. In order to produce the imperfect URSS/RSS samples, we use
the model proposed by Dell and Clutter (1972). Let $X_{[i]j}$ and $X_{(i)j}$ denote the judgment and true order
statistics, respectively. Suppose

$$X_{[i]j} = X_{(i)j} + \epsilon_{ij}, \quad \epsilon_{ij} \sim N(0, \sigma_\epsilon),$$

where $X_{(i)j}$ and $\epsilon_{ij}$ are independent.
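One common way to simulate judgment ranking in the spirit of Dell and Clutter (1972) is to rank each set on noisy perceived values and then record the true value of the unit that received the required judgment rank. The Python sketch below follows that reading; it treats $\sigma_\epsilon$ as a standard deviation, which is an assumption, and the function name `draw_imperfect_urss` and the seed are illustrative.

```python
import random

def draw_imperfect_urss(design, sampler, sigma_eps, rng):
    """Draw a URSS where ranking is done on true value + N(0, sigma_eps) noise.

    The recorded measurement is the true value of the unit judged to have the
    required rank. With sigma_eps = 0 the ranking is perfect.
    """
    k = len(design)
    rows = []
    for r, m_r in enumerate(design):
        row = []
        for _ in range(m_r):
            units = [sampler(rng) for _ in range(k)]
            perceived = [(x + rng.gauss(0.0, sigma_eps), x) for x in units]
            perceived.sort(key=lambda pair: pair[0])   # rank by the noisy value
            row.append(perceived[r][1])                # measure the true value
        rows.append(row)
    return rows

def normal01(g):
    return g.gauss(0.0, 1.0)

rng = random.Random(11)
design_d4 = (3, 10, 3, 3, 3)                           # design D4 from Section 4
noisy_half = draw_imperfect_urss(design_d4, normal01, 0.5, rng)
noisy_one = draw_imperfect_urss(design_d4, normal01, 1.0, rng)
print([len(row) for row in noisy_half])
```

Larger values of `sigma_eps` weaken the link between judgment ranks and true ranks, so the rows become less separated.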
Table 2: Power comparison for the proposed tests under location shift.

           Normal dist.                         Exponential dist.                    Logistic dist.
δ     D    PT     WT     EAT    EAR    PB       PT     WT     EAT    EAR    PB       PT     WT     EAT    EAR    PB
0.1   D1   0.148  0.097  0.152  0.145  0.138    0.229  0.148  0.222  0.208  0.209    0.076  0.049  0.088  0.088  0.077
      D2   0.143  0.069  0.140  0.142  0.139    0.227  0.093  0.216  0.212  0.208    0.116  0.052  0.118  0.125  0.112
      D3   0.145  0.061  0.147  0.150  0.142    0.255  0.130  0.255  0.241  0.242    0.122  0.037  0.130  0.128  0.120
      D4   0.155  0.057  0.156  0.164  0.149    0.216  0.096  0.216  0.204  0.205    0.112  0.032  0.118  0.120  0.116
      D5   0.141  0.064  0.142  0.141  0.142    0.190  0.143  0.187  0.176  0.164    0.108  0.034  0.106  0.102  0.104
0.2   D1   0.389  0.297  0.384  0.388  0.382    0.416  0.304  0.412  0.388  0.380    0.162  0.102  0.177  0.184  0.157
      D2   0.340  0.185  0.337  0.344  0.333    0.375  0.180  0.375  0.359  0.347    0.175  0.085  0.183  0.191  0.176
      D3   0.333  0.143  0.339  0.335  0.327    0.405  0.235  0.399  0.385  0.386    0.159  0.057  0.174  0.175  0.158
      D4   0.336  0.144  0.336  0.337  0.336    0.381  0.172  0.379  0.363  0.360    0.147  0.058  0.155  0.158  0.147
      D5   0.308  0.168  0.310  0.315  0.312    0.190  0.286  0.187  0.176  0.164    0.137  0.064  0.141  0.134  0.139
0.3   D1   0.696  0.600  0.698  0.684  0.694    0.644  0.500  0.650  0.618  0.603    0.294  0.215  0.291  0.302  0.282
      D2   0.571  0.351  0.571  0.569  0.559    0.553  0.292  0.563  0.538  0.517    0.258  0.145  0.261  0.261  0.252
      D3   0.561  0.284  0.564  0.566  0.549    0.604  0.347  0.598  0.581  0.568    0.252  0.093  0.264  0.264  0.249
      D4   0.569  0.302  0.566  0.565  0.559    0.524  0.281  0.518  0.520  0.501    0.223  0.102  0.229  0.232  0.227
      D5   0.557  0.355  0.549  0.556  0.541    0.640  0.476  0.621  0.592  0.573    0.250  0.129  0.251  0.252  0.243

Table 3: Observed $\alpha$-levels for the proposed tests for testing $H_0: \mu = 0$ for the normal distribution and
$H_0: \mu = 1$ for the exponential and logistic distributions, under imperfect ranking.

     σ_ε = 0.5                            σ_ε = 1
D    PT     EAT    EAR    IEAT   IEAR     PT     EAT    EAR    IEAT   IEAR
Normal Distribution
D1   0.056  0.054  0.054  0.053  0.056    0.069  0.068  0.066  0.066  0.068
D2   0.072  0.072  0.070  0.070  0.073    0.074  0.077  0.081  0.071  0.077
D3   0.067  0.066  0.069  0.067  0.067    0.087  0.081  0.079  0.081  0.077
D4   0.058  0.057  0.057  0.060  0.056    0.068  0.070  0.066  0.060  0.066
D5   0.067  0.063  0.067  0.066  0.065    0.067  0.070  0.069  0.069  0.066
Exponential Distribution
D1   0.073  0.065  0.068  0.068  0.068    0.067  0.059  0.060  0.059  0.056
D2   0.084  0.076  0.079  0.077  0.078    0.083  0.078  0.075  0.078  0.074
D3   0.099  0.094  0.094  0.093  0.092    0.063  0.058  0.063  0.053  0.051
D4   0.103  0.100  0.099  0.096  0.097    0.076  0.082  0.076  0.068  0.070
D5   0.078  0.069  0.070  0.069  0.069    0.071  0.067  0.066  0.059  0.065
Logistic Distribution
D1   0.060  0.061  0.061  0.062  0.064    0.058  0.061  0.061  0.056  0.061
D2   0.071  0.074  0.074  0.070  0.073    0.075  0.076  0.079  0.076  0.079
D3   0.077  0.078  0.079  0.081  0.079    0.071  0.072  0.072  0.067  0.071
D4   0.078  0.080  0.080  0.080  0.078    0.075  0.079  0.075  0.075  0.076
D5   0.068  0.069  0.065  0.067  0.067    0.064  0.060  0.063  0.064  0.063

Table 4: Power comparison for the proposed tests under location shift and imperfect ranking with $\sigma_\epsilon = 0.5$.

           Normal dist.                         Exponential dist.                    Logistic dist.
δ     D    PT     EAT    EAR    IEAT   IEAR     PT     EAT    EAR    IEAT   IEAR     PT     EAT    EAR    IEAT   IEAR
0.1   D1   0.162  0.162  0.163  0.168  0.162    0.218  0.212  0.203  0.208  0.202    0.090  0.099  0.101  0.100  0.105
      D2   0.163  0.161  0.168  0.170  0.174    0.211  0.208  0.201  0.199  0.195    0.105  0.114  0.110  0.112  0.109
      D3   0.143  0.146  0.152  0.143  0.150    0.230  0.223  0.221  0.225  0.225    0.105  0.116  0.116  0.113  0.119
      D4   0.149  0.155  0.154  0.154  0.161    0.222  0.215  0.212  0.210  0.208    0.105  0.111  0.116  0.112  0.114
      D5   0.158  0.158  0.159  0.155  0.155    0.212  0.190  0.189  0.193  0.190    0.088  0.090  0.093  0.094  0.094
0.2   D1   0.394  0.397  0.399  0.404  0.400    0.413  0.382  0.381  0.388  0.381    0.160  0.171  0.169  0.171  0.166
      D2   0.349  0.353  0.355  0.357  0.358    0.379  0.362  0.364  0.355  0.350    0.159  0.171  0.172  0.169  0.172
      D3   0.325  0.340  0.337  0.333  0.340    0.394  0.373  0.378  0.374  0.375    0.152  0.157  0.161  0.156  0.162
      D4   0.332  0.327  0.333  0.328  0.338    0.373  0.354  0.358  0.352  0.351    0.169  0.171  0.176  0.179  0.181
      D5   0.326  0.322  0.328  0.328  0.334    0.412  0.374  0.374  0.372  0.367    0.148  0.156  0.151  0.152  0.153
0.3   D1   0.709  0.708  0.704  0.709  0.706    0.643  0.615  0.609  0.607  0.607    0.303  0.310  0.309  0.309  0.312
      D2   0.584  0.588  0.586  0.584  0.588    0.517  0.498  0.498  0.493  0.482    0.259  0.269  0.273  0.277  0.275
      D3   0.556  0.563  0.561  0.557  0.556    0.594  0.571  0.570  0.569  0.566    0.238  0.249  0.254  0.250  0.255
      D4   0.571  0.570  0.565  0.565  0.569    0.530  0.516  0.518  0.508  0.507    0.247  0.249  0.249  0.248  0.257
      D5   0.558  0.563  0.555  0.556  0.561    0.619  0.572  0.574  0.568  0.563    0.247  0.255  0.252  0.255  0.251


Using imperfect URSS with $\sigma_\epsilon = 0.5$ and $1$, we report the observed significance levels for testing
$H_0: \mu = \mu_0$ against $H_a: \mu > \mu_0$ for different methods in Table 3. These choices of $\sigma_\epsilon$ resulted in
observed correlation coefficients of 0.89 and 0.70 between the ranking variable and the variable of
interest, respectively. As compared with the results under the perfect ranking assumption, the proposed
methods seem to be robust with respect to imperfect ranking. The test under the exponential
distribution is a bit liberal under imperfect ranking. We also observe that imperfect ranking affects the
power of the tests since, as shown in Table 4, the power of the proposed tests decreases when errors are
added to the ranking. The importance of accurate ranking in RSS designs has been mentioned in several works.
Frey, Ozturk and Deshpande (2007) considered nonparametric tests for perfect judgment ranking. Li
and Balakrishnan (2008) proposed several nonparametric tests to investigate the perfect ranking assumption.
Vock and Balakrishnan (2011) suggested a Jonckheere-Terpstra type test statistic for perfect ranking in
balanced RSS. These tests are further studied by Frey and Wang (2013) and compared with the most
powerful test.

In order to derive the theoretical results under the imperfect ranking assumption, one can proceed as
follows. First, note that under imperfect ranking the density function of the characteristic of interest for
the unit judged to have rank r is no longer $f_{(r)}$. We denote this density by $f_{[r]}$. One approach to derive
the CDF $F_{[r]}$ of the rth judgmental order statistic is to use the following model

$$F_{[r]}(x) = \sum_{s=1}^{k} p_{sr} F_{(s)}(x), \qquad (18)$$

where $p_{sr}$ is the probability that the sth order statistic is judged to have rank r, with
$\sum_{s=1}^{k} p_{sr} = \sum_{r=1}^{k} p_{sr} = 1$.
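
As a numerical illustration of model (18), the following Python sketch evaluates $F_{[r]}$ as a mixture of order-statistic CDFs and checks that the judgment-class CDFs average to $F$ whenever each row of the matrix $(p_{sr})$ sums to one. The matrix `P` and the function names are hypothetical, and the order-statistic CDF is written in terms of $u = F(x)$ using the standard binomial-sum formula.

```python
from math import comb

def order_stat_cdf(u, s, k):
    """P(X_(s) <= x) in a set of size k, where u = F(x)."""
    return sum(comb(k, j) * u**j * (1 - u)**(k - j) for j in range(s, k + 1))

def judgment_cdf(u, r, p):
    """F_[r](x) = sum_s p[s-1][r-1] * F_(s)(x), as in model (18), with u = F(x)."""
    k = len(p)
    return sum(p[s - 1][r - 1] * order_stat_cdf(u, s, k) for s in range(1, k + 1))

# Hypothetical doubly stochastic misranking matrix for k = 3:
# entry P[s-1][r-1] is the probability that the sth order statistic is judged rank r.
P = [[0.8, 0.2, 0.0],
     [0.2, 0.6, 0.2],
     [0.0, 0.2, 0.8]]

u = 0.3
k = len(P)
average = sum(judgment_cdf(u, r, P) for r in range(1, k + 1)) / k
print(abs(average - u) < 1e-12)
```

The check relies only on the row sums $\sum_{r} p_{sr} = 1$ and on the identity $\frac{1}{k}\sum_{s=1}^{k} F_{(s)}(x) = F(x)$.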

Lemma 3. Suppose the imperfect ranking in the RSS design is such that

$$F_{[r]}(x) = \sum_{s=1}^{k} p_{sr} F_{(s)}(x), \qquad \forall x \in \mathbb{R}.$$

For the resampling technique EAT (or EAR) under the imperfect ranking assumption, which is denoted by
IEAT (or IEAR), we have

$$\sup_{t \in \mathbb{R}} |F^*_{\langle n \rangle}(t) - F(t)| = 0,$$

where $F^*_{\langle n \rangle}(t)$ is the EDF of the resulting bootstrap sample.

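Proof. We first note that using the IEAT (or IEAR), we have

$$F^*_{[r]}(t) = \sum_{s=1}^{k} p_{sr} F^*_{(s)}(t),$$

where $F^*_{[r]}(\cdot)$ and $F^*_{(r)}(\cdot)$ are the EDFs of the resulting bootstrap samples under the IEAT (or IEAR) and
EAT (or EAR), respectively. One can easily show that

$$F^*_{\langle n \rangle}(t) = \frac{1}{k}\sum_{r=1}^{k} F^*_{[r]}(t) = \frac{1}{k}\sum_{r=1}^{k}\sum_{s=1}^{k} p_{sr} F^*_{(s)}(t) = \frac{1}{k}\sum_{s=1}^{k}\sum_{r=1}^{k} p_{sr} F^*_{(s)}(t) = \frac{1}{k}\sum_{s=1}^{k} F^*_{(s)}(t) = F^*_{n}(t),$$

where the next-to-last equality uses $\sum_{r=1}^{k} p_{sr} = 1$.
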

Hence, we have 


F <n> (t) - F(t) = ( F <n> (t ) - F(t)) + (F n (t) - F{t )) = 0(—), 


and this completes the proof. 


□ 


4.3 Comparison with the empirical likelihood method 

In this section, we compare the performance of the bootstrap tests based on ET estimators of F with
those based on the empirical likelihood estimator of F, which has already been studied in the literature by
Baklizi (2009) and Liu et al. (2009). Empirical likelihood is an estimation method based on likelihood
functions that does not require specifying a parametric family for the observed data. Empirical likelihood
methodology has become a powerful and widely applicable tool for nonparametric statistical inference and
has been used under different sampling designs. For a comprehensive review of the empirical likelihood
method and some of its variations see Owen (2001). For testing the null hypothesis $H_0 : \mu = \mu_0$ using the
empirical likelihood estimator of F based on a balanced RSS sample, Baklizi (2009) showed that under the
finite variance assumption


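$$C_0\, l(\mu_0) \xrightarrow{d} \chi^2_1, \qquad (19)$$

where

$$C_0 = \frac{\sum_{r=1}^{k} E(X_{[r]} - \mu_0)^2}{\sum_{r=1}^{k} \sigma_{[r]}^2} \quad \text{and} \quad l(\mu_0) = \frac{\left\{\sum_{r=1}^{k}\sum_{j=1}^{m} (X_{[r]j} - \mu_0)\right\}^2}{\sum_{r=1}^{k}\sum_{j=1}^{m} (X_{[r]j} - \mu_0)^2}. \qquad (20)$$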


However, this test is liberal for small samples and it does not apply to the URSS case. Liu et al. (2009)
proposed to use the empirical likelihood method for RSS data by first averaging the observations of each
cycle to construct

$$\bar{X}_j = \frac{1}{k}\sum_{r=1}^{k} X_{(r)j}, \qquad j = 1, \ldots, m.$$

Then, by observing that the $\bar{X}_j$ are i.i.d. with mean $\mu$, Liu et al. (2009) constructed the usual empirical
likelihood estimator of F and used it for a hypothesis testing problem. As we show below, this method
does not perform well, especially for RSS samples when the number of cycles is small.
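
To make the two empirical likelihood constructions concrete, the following Python sketch computes the ratio in (20), read here as the squared sum of deviations from $\mu_0$ divided by the sum of squared deviations, together with the cycle averages used by Liu et al. (2009). The data layout (`rss[r][j]` holds the measurement with judgment rank r+1 in cycle j+1), the toy numbers and the function names are assumptions made for illustration. The scaling constant $C_0$ is not computed.

```python
def el_statistic(rss, mu0):
    """Ratio l(mu0): (sum of deviations)^2 / (sum of squared deviations)."""
    deviations = [x - mu0 for row in rss for x in row]
    return sum(deviations) ** 2 / sum(d * d for d in deviations)

def cycle_means(rss):
    """Average over the k ranks within each of the m cycles of a balanced RSS."""
    k = len(rss)
    m = len(rss[0])
    return [sum(rss[r][j] for r in range(k)) / k for j in range(m)]

# Toy balanced RSS with k = 2 ranks and m = 2 cycles (made-up numbers).
toy = [[1.0, 2.0],
       [3.0, 5.0]]
print(el_statistic(toy, 2.0))   # 9/11 for these numbers
print(cycle_means(toy))         # [2.0, 3.5]
```

With only m cycle averages available, the second approach bases inference on m points rather than on all mk measurements, which is one way to see why it can struggle when the number of cycles is small.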

The following simulation study shows that EAR based on the ET estimator of F can be used
to overcome these difficulties. To this end, we consider a balanced RSS with a small sample size, i.e.,
$D_0 = (2, 2, 2, 2, 2)$. Figure 1 shows the Q-Q plots of the p-values based on the EAR algorithm (first column),
and those proposed by Baklizi (2009) (second column) and Liu et al. (2009) (third column), for the
normal distribution when $H_0 : \mu = 0$ and for the exponential and logistic distributions when $H_0 : \mu = 1$.

5 Real data application 

In this section, we use a data set containing the birth weight and seven-month weight of 224 lambs along
with the mother’s weight at time of mating, collected at the Research Farm of Ataturk University, Erzurum,
Turkey. Jafari Jozani and Johnson (2012) as well as Ozturk and Jafari Jozani (2014) used this data set
to study the performance of ranked set sampling in estimating the mean, the total and the quantiles of
the seven-month weight of these lambs. Measuring the weight of young sheep is usually labor
intensive due to their active nature, and measurement errors can be inflated by this activity. However,
one can easily rank a small number of lambs based on their birth weights or their mothers’ weights to
perform a ranked set sampling design, in the hope that the RSS sample is more representative of
the whole population. Here, we treat these 224 records as our population and consider a hypothesis
testing problem about the mean seven-month weight of these 224 lambs. We
consider both perfect and imperfect ranking cases. For the perfect ranking scenario, ranking is done based
on the weight of the lambs at seven months. For the imperfect ranking, we consider two cases. In the first
case (Imperfect 1), ranking is done based on the birth weight of the lambs. Kendall’s $\tau$ between
the seven-month weight and the birth weight is 0.64. In the second case (Imperfect 2), ranking is
based on the mother’s weight at time of mating, which results in a smaller Kendall’s $\tau$ of 0.41
between the lambs’ weight at seven months and the mother’s weight at the time of mating. Summary statistics
for these variables for the underlying population are presented in Table 5. Figure 2 shows the histogram
of the seven-month weight of these lambs with a kernel density estimator of their weight distribution. We
also present the scatter plots of the birth weight and mother’s weight of these lambs against their weight
at seven months. We observe a stronger association between the seven-month weight and the
birth weight of these lambs. So, we expect to observe better results under the Imperfect 1 scenario.
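
The sampling scheme described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used for the reported results: the population is synthetic (it is not the lamb data), the field names `birth` and `w7` are hypothetical, each set of k units is drawn without replacement within the set, and different sets are drawn independently of one another.

```python
import random

def rss_from_population(pop, design, rank_key, rng):
    """Draw a ranked set sample from a finite population.

    design[r-1] is the number of units measured at judgment rank r, and
    rank_key maps a record to the variable used for ranking.
    """
    k = len(design)
    sample = []
    for r, n_r in enumerate(design, start=1):
        for _ in range(n_r):
            units = rng.sample(pop, k)            # one set of k distinct units
            ranked = sorted(units, key=rank_key)  # rank the set on the ranking variable
            sample.append((r, ranked[r - 1]))     # measure only the unit judged rank r
    return sample

# Synthetic population of 224 records with a ranking variable related to the response.
rng = random.Random(0)
pop = []
for _ in range(224):
    birth = rng.gauss(4.4, 0.8)
    pop.append({"birth": birth, "w7": 28.0 + 3.0 * (birth - 4.4) + rng.gauss(0.0, 3.0)})

# Imperfect ranking: rank on `birth`, then measure `w7` on the selected units.
sample = rss_from_population(pop, (2, 2, 2, 2, 2), lambda u: u["birth"], rng)
weights = [u["w7"] for _, u in sample]
```

Passing `lambda u: u["w7"]` as `rank_key` instead gives the perfect ranking scenario, since the sets are then ordered on the response itself.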

Table 6 presents the results of the analysis for testing $H_0 : \mu = 28.11$ based
on different RSS sampling designs as in Section 4. Based on the observed $\alpha$-level for each sampling design
under the PT and EAR algorithms, we observe that our proposed bootstrap test using the ET estimator
of the DF shows a satisfactory performance compared with the PT method in both perfect and imperfect
ranking scenarios.
[Figure 1 panels: rows Normal, Exponential, Logistic; columns a, b, c]

Figure 1: The Q-Q plots of the p-values of the proposed test statistic based on the EAR algorithm (first
column), and those proposed by Baklizi (2009) (second column) and Liu et al. (2009) (third column),
for the normal distribution when $H_0 : \mu = 0$ and for the exponential and logistic distributions when
$H_0 : \mu = 1$.

6 Concluding Remarks 

We propose nonparametric estimators of the cumulative distribution function of a continuous random variable
using the ET empirical likelihood method based on ranked set sampling designs. The ET DF estimators are
used to construct new resampling techniques for URSS data. We study different properties of the proposed
algorithms. For a hypothesis testing problem, we show that the bootstrap test based on the exponentially
tilted estimators exhibits a small bias of order $O(n^{-1})$, which is a very desirable property. We compare the
performance of our proposed techniques with those based on empirical likelihood. The latter are developed
under the balanced RSS assumption and are not applicable to the URSS situation. The results of the
Table 5: Summary statistics for the values of the birth weight and seven-month weight of 224 lambs along
with the mother’s weight at time of mating, collected at the Research Farm of Ataturk University, Erzurum,
Turkey

Variable              Min     Q1      Median  Mean    Q3      Max     σ²
Seven-month weight    20.30   25.50   27.90   28.11   31.00   40.50   15.21
Birth weight           2.50    3.87    4.40    4.36    4.80    6.70    0.63
Mother’s weight       42.20   49.68   52.30   52.26   55.10   63.70   19.22



[Figure 2 histogram panel: horizontal axis "Seventh", ticks from 20 to 40]

Figure 2: The histogram of the values of seven-month weight of 224 lambs with a kernel density estimator
of their weight distribution as well as the scatter plots of the birth weight and mother’s weight of these
lambs against their weight at seven months.


Table 6: The values of the observed $\alpha$-levels for testing $H_0 : \mu = 28.11$ for the weight distribution of
a population of 224 lambs based on different perfect and imperfect RSS designs using the PT and EAR
algorithms.

                  Method   D1      D2      D3      D4      D5
Perfect Ranking   PT       0.062   0.094   0.085   0.076   0.083
                  EAR      0.055   0.052   0.044   0.046   0.047
Imperfect 1       PT       0.064   0.082   0.091   0.087   0.094
                  EAR      0.048   0.047   0.052   0.047   0.045
Imperfect 2       PT       0.065   0.090   0.086   0.086   0.091
                  EAR      0.048   0.042   0.051   0.043   0.046


simulation studies as well as a real data application show that the method based on ET estimators of the
DF performs very well even for moderate or small sample sizes.